Using Lightweight Procedures to Improve Instruction Cache Performance

Authors

  • Krishna Kunchithapadam
  • James R. Larus
Abstract

Instruction cache performance is widely recognized as a critical component of the overall performance of a program, especially in the case of large applications like database servers. In this report, we present a technique for (1) identifying repeated blocks of instructions in a program executable, and (2) converting these repeated code blocks into lightweight procedures (i.e. LWprocs). The use of LWprocs reduces the static code size of a program, and can potentially reduce the working set size of the process, at the cost of increasing its dynamic instruction count. However, the tradeoff seems to be in favor of the reduction in working set size for most programs. Even with a simple model of program structure and a straightforward technique for generating LWprocs, we find performance improvements between 3% and 9% for programs in the SPECINT95 suite. However, the technique sometimes leads to slowdowns (between 5% and 27%) for some programs, suggesting that lightweight procedures should be used with care.
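The two steps named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a program is already available as a list of basic blocks, each a list of instruction strings, and it assumes that a call and a return each cost one instruction. The names `find_repeated_blocks`, `outline`, `static_size`, and `lwproc_N` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def find_repeated_blocks(blocks, min_len=2):
    """Group the indices of basic blocks whose instruction sequences are identical."""
    groups = defaultdict(list)
    for index, block in enumerate(blocks):
        if len(block) >= min_len:  # very short blocks are not worth a call/return
            groups[tuple(block)].append(index)
    # Keep only sequences that occur more than once.
    return {seq: idxs for seq, idxs in groups.items() if len(idxs) > 1}


def outline(blocks, min_len=2):
    """Replace every copy of a repeated block with a call to one shared copy.

    Returns (rewritten_blocks, procedures). Assumption: one 'call' and one
    'ret' instruction are enough to enter and leave a lightweight procedure.
    """
    repeated = find_repeated_blocks(blocks, min_len)
    rewritten = [list(block) for block in blocks]
    procedures = {}
    for n, (seq, idxs) in enumerate(repeated.items()):
        name = f"lwproc_{n}"  # hypothetical naming scheme
        procedures[name] = list(seq) + ["ret"]
        for i in idxs:
            rewritten[i] = [f"call {name}"]
    return rewritten, procedures


def static_size(blocks, procedures=None):
    """Count instructions in the blocks plus any shared procedures."""
    size = sum(len(block) for block in blocks)
    if procedures:
        size += sum(len(body) for body in procedures.values())
    return size


# Toy program: the same three-instruction block appears three times.
increment = ["ld r1, 0(r2)", "add r1, r1, 1", "st r1, 0(r2)"]
program = [increment, ["mov r3, r4"], list(increment), list(increment)]

rewritten, procedures = outline(program)
size_before = static_size(program)
size_after = static_size(rewritten, procedures)
```

In this toy program the static size drops from 10 to 8 instructions, while each execution of the repeated block now runs 5 instructions (call, three original instructions, return) instead of 3. That is the tradeoff the abstract describes: smaller static code at the cost of a higher dynamic instruction count.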

Similar resources

Design and Implementation of a Lightweight Dynamic Optimization System

Many opportunities exist to improve micro-architectural performance due to performance events that are difficult to optimize at static compile time. Cache misses and branch mis-prediction patterns may vary for different micro-architectures using different inputs. Dynamic optimization provides an approach to address these and other performance events at runtime. This paper describes a software s...

Temporal-Based Procedure Reordering for Improved Instruction Cache Performance

As the gap between memory and processor performance continues to grow, it becomes increasingly important to exploit cache memory effectively. Both hardware and software techniques can be used to better utilize the cache. Hardware solutions focus on organization, while most software solutions investigate how to best layout a program on the available memory space. In this paper we present a new l...

Combining Instruction Prefetching with Partial Cache Locking to Improve WCET in Real-Time Systems

Caches play an important role in embedded systems to bridge the performance gap between fast processor and slow memory. And prefetching mechanisms are proposed to further improve the cache performance. While in real-time systems, the application of caches complicates the Worst-Case Execution Time (WCET) analysis due to its unpredictable behavior. Modern embedded processors often equip locking m...

Analysis of Temporal-Based Program Behavior for Improved Instruction Cache Performance

In this paper, we examine temporal-based program interaction in order to improve layout by reducing the probability that program units will conflict in an instruction cache. In that context, we present two profile-guided procedure reordering algorithms. Both techniques use cache line coloring to arrive at a final program layout and target the elimination of first generation cache conflicts (i....

Efficient Procedure Mapping Using Cache Line Coloring

As the gap between memory and processor performance continues to widen, it becomes increasingly important to exploit cache memory effectively. Both hardware and software approaches can be explored to optimize cache performance. Hardware designers focus on cache organization issues, including replacement policy, associativity, line size and the resulting cache access time. Software writers use va...

Publication date: 1999